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1.  Introduction 

We  prove  a  lower  bound  of  fl(n  log  n)  messages,  for  the  problem  of  electing  a  leader  in  a  ring  of  n 
processors.  The  processors  are  assumed  to  be  identical  except  that  each  has  its  own  unique  identifier.  Each 
processor  is  allowed  to  know  n,  the  size  of  the  ring.  Processors  can  communicate  only  with  their  neighbors  in 
the  ring,  by  sending  messages.  We  assume  that  execution  of  the  system  is  synchronous,  and  that  all 
processors  begin  their  algorithms  at  the  same  time.  Unlike  in  previous  work,  the  value  of  n  is  arbitrary,  and 
not  constrained  to  be  a  power  of  two.  We  prove  the  result  for  comparison  algorithms,  but  previously  known 
techniques  using  Ramsey's  theorem  show  that  the  result  holds  under  different  assumptions. 

The  problem  of  determining  the  number  of  messages  required  to  elect  a  leader  in  a  ring  has  attracted 
considerable  attention  [B,CR,DKR,FL,GHS,HS,IR,L,P,V].  There  are  deterministic  algorithms  for  the 
synchronous  model  [FL,V]  which  use  O(n)  messages,  but  which  require  large  amounts  of  time  and  which  use 
the  processor  ID’s  to  count.  The  best  deterministic  algorithms  which  use  comparisons  of  ll)’s  only  use  0(n 
log  n)  messages  [HS,GHS,B,DKR,P].  These  algorithms  work  for  both  the  synchronous  and  asynchronous 
models.  On  the  other  hand,  a  lower  bound  of  fl(n  log  n)  (with  constant  1/4)  has  been  proved  for  the  number 
of  messages  required  if  communication  is  asynchronous  [B],  If  communication  is  synchronous,  and  either  (a) 
the  algorithm  is  restricted  to  using  comparisons  on  ID'S  only,  or  (b)  the  algorithm  is  time-bounded 
independently  of  the  size  of  the  set  from  which  the  ID’s  are  chosen,  then  an  fi(n  log  n)  lower  bound  (with 
constant  1/2)  has  been  established  [FI],  However,  the  results  of  [FL]  require  the  severe  limitation  that  the 
value  of  n  is  a  power  of  2.  In  this  paper,  we  show  how  to  prove  an  fi(n  log  n)  lower  bound  for  the  same 
models,  but  for  arbitrary  values  of  n. 

Complexity  results  are  often  first  proved  for  powers  of  2,  and  then  extended  to  arbitrary  values  in  a 
straightforward  way.  The  lower  bound  proofs  of  [FL]  do  not  appear  to  extend  in  such  a  simple  manner. 
Those  lower  bounds  are  proved  for  a  particular  highly  symmetric  circular  configuration  of  ID’s,  where  the 
symmetry  is  based  on  having  a  number  of  processor  ID’s  that  is  precisely  a  power  of  2.  One  might  construct  a 
symmetric  configuration  for  a  power  of  2,  and  then  try  to  extend  it  is  some  way  for  the  extra  processors. 
However,  this  strategy  introduces  special  treatment  for  the  extra  processors.  This  special  treatment  might 
change  the  behavior  of  the  algorithm  entirely,  perhaps  allowing  some  processor  to  become  el  acted  easily. 

We  introduce  several  nice  techniques  beyond  those  in  [FL].  Two  of  these  techniques  are  of  particular 
interest.  First,  we  generalize  the  notion  of  "chains"  from  [FL],  as  a  way  of  describinj  limitations  on 
information  flow  during  a  computation.  Second,  we  develop  a  hierarchical  assignment  of  processor  ID’S  that 


creates  considerable  "replication  symmetry"  around  the  ring.  Such  an  assignment  can  be  produced  for 
arbitrary  values  of  n.  The  fact  that  there  are  rings  of  every  size  having  this  type  of  symmetry  seems  to  us  to  be 
quite  interesting. 

Our  new  lower  bound  still  requires  that  either  assumption  (a)  or  (b)  above  hold.  We  give  the  proof  for  case 
(a)  only,  with  the  result  for  case  (b)  following  as  in  [FL],  using  a  reduction  defined  via  Ramsey’s  theorem. 

The  constant  we  obtain  for  our  general  fl(n  log  n)  bound  is  1/(18  log2(9)),  which  is  much  smaller  than  the 
constant  of  1/2  obtained  in  [FL],  Thus,  for  the  special  case  where  n  is  a  power  of  2,  the  results  of  [FL]  still 
give  the  strongest  known  lower  bounds. 

The  remainder  of  the  paper  is  organized  as  follows.  Section  2  contains  the  formal  model  and  problem 
statement.  Section  3  contains  the  general  theory  used  to  prove  the  lower  bound  for  comparison  algorithms. 
Namely,  it  is  shown  that,  in  a  ring  with  certain  "replication  symmetry"  properties,  many  messages  are 
required  in  order  to  distinguish  certain  processors.  Section  4  contains  a  construction  of  a  ring  with  such 
replication  symmetry  properties.  Section  5  integrates  the  results  of  Sections  3  and  4,  thus  yielding  the  lower 
bound  result  for  comparison  algorithms.  As  a  corollary,  we  obtain  a  corresponding  lower  bound  for  time- 
bounded  algorithms.  Section  6  describes  remaining  questions. 

2.  Formal  Model  and  Problem  Statement 

In  this  section,  we  describe  the  formal  model  we  use  for  our  lower  bounds.  The  contents  of  this  section  are 
summarized  from  [FL],  and  the  interested  reader  is  referred  to  the  other  paper  for  additional  details. 

2.1 .  Algorithms 

All  processors  are  assumed  to  be  identical  except  that  each  has  its  own  unique  identifier,  chosen  from  an  ID 
space  X,  a  totally  ordered  set.  We  assume  that  all  processors  begin  their  election  algorithms  at  the  same 
time.  Each  processor  is  modelled  as  an  automaton  that  behaves  as  follows.  Initially,  the  processor  has  an  ID 
from  an  ID  space  X  recorded  in  its  state.  At  each  round,  the  processor  examines  its  state  and  decides 
whether  to  send  a  message  to  each  of  its  neighbors,  and  what  message  to  send.  Then  each  processor 
receives  any  messages  sent  to  it  in  that  round.  The  processor  uses  its  current  state  and  these  new  messages 
to  update  its  state.  Certain  of  the  states  are  designated  as  "elected"  states. 

It  may  be  assumed,  without  loss  of  generality,  that  algorithms  are  in  a  certain  normal  form.  In  this  normal 
form,  the  state  of  each  processor  records  exactly  its  initial  ID  and  the  history  of  messages  received,  and  each 
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message  which  is  sent  contains  the  entire  state  of  the  sending  processor.  We  represent  such  history 
informat;on  by  means  of  LISP  S-expressions.  The  S-expressions  that  arise  during  computation  are  of  a 
special  type,  which  we  will  call  the  well-formed  S-expressions  over  X.  The  atoms  are  x  €  X  and  NIL.  The 
well-formed  S-expressions  over  X  are  just  the  following:  (1)  the  elements  of  X,  and  (2)  these  of  the  form 
(s1,s2,s3),  where  s2  is  a  well-formea  S-expression  over  X,  and  s1  and  s3  are  either  well-formed  S-expressions 
over  X  or  NIL.  Let  T(X)  denote  the  set  of  well-formed  S-expressions  over  X. 

The  initial  states  will  just  be  the  ID’s  x  £  X.  Each  message  will  contain  exactly  the  state  of  the  sending 
processor.  When  a  processor  in  state  s  receives  messages  s1  and  s2  from  its  counterclockwise  and 
clockwise  neighbors  respectively,  its  new  state  will  be  the  S-expression  (s1,s,s2).  (If  no  message  is  received 
from  a  neighbor,  the  atom  NIL  is  used  in  place  of  s1  or  s2  as  a  placeholder.)  To  complete  the  algorithm 
specification,  we  define  a  function  which  determines  when  messages  are  to  be  sent  in  either  direction,  and  a 
designation  of  which  states  indicate  that  the  processor  has  been  elected.  Thus,  an  algorithm  over  X  is  a  pair 
(E,/x),  where  E  C  T(X)  is  the  set  of  elected  states,  and  p,  a  mapping  T(X)  x  {counterclockwise*, clockwise}  — » 
{yes.no},  is  the  message  generation  function.  We  assume  that  the  set  E  of  elected  states  is  "closed":  if  s  €  E 
and  sv  s2  €  T(X)  U  {NIL},  then  (s1,s,s2)  €  E.  Thus,  once  a  processor  has  been  elected,  it  will  remain  elected. 

2.2.  Executions 

To  facilitate  discussion,  we  index  the  processors  in  the  ring  clockwise,  as  0,...,n-1.  We  count  indices 
modulo  n.  A  ring  of  size  n  over  ID  space  X  is  an  n-tuple  of  elements  of  X,  giving  the  initial  ID's  of  the 
processors  0,...,n-1 ,  in  order.  A  configuration  of  size  n  is  an  n-tuple  of  S-expressions  in  T(X),  representing  the 
states  for  the  n  processors.  A  message  vector  of  size  n  is  an  n-tuple  of  ordered  pairs  of  elements  of  J(X)  U 
{null}.  It  represents  the  messages  sent  counterclockwise  and  clockwise  by  each  of  the  n  processors. 

An  execution  of  an  algorithm  for  ring  R  of  size  n  is  an  infinite  sequence  of  triples  (CyM,C2),  where  C1  and  C2 
are  configurations  and  M  is  a  message  vector,  all  of  size  n.  We  require  executions  to  satisfy  several 
properties.  First,  the  initial  configuration  must  be  R.  Second,  the  second  configuration  in  each  triple  must  be 
the  same  as  the  first  configuration  in  the  next  triple.  Finally,  each  triple  in  an  execution  must  describe  correct 
message  generation,  as  given  by  p,  and  correct  state  changes,  as  described  earlier.  An  execution  fragment 
is  any  finite  prefix  of  an  execution. 

We  now  define  our  complexity  measures.  We  measure  the  number  of  messages  sent  and  the  number,  of 
rounds  taken  only  up  to  the  point  where  a  processor  becomes  elected.  (This  convention  only  serves  to 
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strengthen  our  lower  bound.)  For  any  execution  e,  let  finishtime(e)  denote  the  number  of  the  first  round  after 
which  a  processor  has  entered  a  state  in  E.  Let  messages(e)  denote  the  number  of  messages  sent  during  e, 
up  to  and  including  round  finishtime(e). 

2.3.  Election  of  a  Leader 

Let  X  be  an  ID  space  with  |X|  >  n.  A  ring  algorithm  over  X  is  said  to  elect  a  leader  in  rings  of  size  n  provided 
that  in  each  execution,  e,  of  the  algorithm,  for  a  ring  R  of  size  n  over  X,  exactly  one  processor  eventually 
enters  a  state  in  E. 

2.4.  Comparison  Algorithms 

We  say  that  two  S-expressions,  s  and  s’,  over  X  are  order-equivalent  provided  that  they  are  structurally 
equivalent  as  S-expressions,  and  if  two  atoms  from  s  satisfy  one  of  the  order  relations  <,  =  or  >,  then  the 
corresponding  atoms  from  s’  satisfy  the  same  relation.  An  algorithm  is  a  comparison  algorithm  provided  that 
if  s  and  s'  are  order-equivalent  well-formed  S-expressions  over  X,  then  processors  with  states  s  and  s’ 
transmit  messages  in  the  same  direction  or  directions  and  have  the  same  election  status.  That  is, 
/i(s, counterclockwise)  =  a(s’, counterclockwise),  fi(s, clockwise)  =  /•  (s’, clockwise),  and  s  is  in  E  exactly  if  s’  is 
in  E. 

3.  Chains 

In  this  section,  we  describe  the  general  theory  needed  for  our  lower  bound  proof  for  comparison  algorithms. 
We  introduce  the  concept  of  a  "chain",  which  describes  information  flow  during  an  execution  of  a  ring 
algorithm.  The  notion  of  a  "chain"  used  in  this  paper  is  a  substantial  generalization  of  the  notion  of  a  "chain" 
used  for  a  similar  purpose  in  [FL].  For  comparison  algorithms,  we  show  that  nonexistence  of  certain  chains 
implies  that  certain  processors  in  a  ring  remain  indistinguishable.  Our  definition  of  a  chain  is  rather  unusual, 
but  turns  out  to  be  exactly  the  right  description  of  information  flow  to  prove  this  indistinguishability. 

3.1.  Basic  Definitions 

A  k -segment  of  a  ring  is  a  length  k  sequence  of  consecutive  processors  in  the  ring,  in  clockwise  order.  Let 
S  and  T  be  two  k-segments  in  a  ring,  with  first  processors  p  and  q  respectively  and  last  processors  p’  and  q' 
respectively,  and  let  e  be  an  execution  (or  execution  fragment)  of  an  algorithm  in  the  ring,  Then  a  clockwise 
chain  in  e  for  (S,T)  is  a  length  k  subsequence  of  the  steps  of  e,  e  ,  e  , ...,  e  ,  such  that  the  following  is  true. 

1  '2  'k 

In  each  step  e( ,  a  message  is  sent  either  by  processor  p  +  j  •  2  to  processor  p  +  j  •  1,  or  else  by  processor  q 
i 

+  j  -  2  to  processor  q  +  j  - 1 .  Thus,  a  clockwise  chain  for  a  pair  of  segments  describes  combined  information 
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flow  clockwise  in  the  two  segments,  from  outside  the  two  segments  up  to  the  last  processors  p’  and  q’.  A 

counterclockwise  chain  in  e  for  (S ,T)  is  defined  analogously,  for  information  flow  counterclockwise:  in  each 

step  e( ,  a  message  is  sent  either  by  processor  p’  •  j  +  2  to  processor  p’  -  j  +  1,  or  else  by  processor  q'  •  j  +  2 
i 

to  processor  q’  •  j  +  1, 

Two  length  k  vectors  of  X-elements  are  said  to  be  order-equivalent  provided  that  the  elements  in 
corresponding  positions  satisfy  the  same  ordering  relations  in  the  two  vectors.  (That  is,  if  the  wo  vectors  are 
a  and  b,  then  a.  and  a;  satisfy  the  same  relation,  <,  =  or  >,  as  b;  and  b.)  Two  segments  S  and  T  are  said  to  be 
order-equivalent  in  a  particular  ring  R  provided  that  the  sequences  of  initial  ID’s  of  the  processors  in  the  two 
segments  are  order-equivalent. 

Let  e  be  an  execution  fragment.  Then  maxcw(e)  is  defined  to  be  the  maximum  k  for  which  there  are 
order-equivalent  length  k  segments  S  and  T  (possibly  with  S  =  T),  such  that  e  contains  a  clockwise  chain  for 
(S,T).  The  quantity  maxccw(e)  is  defined  analogously.  Let  sum(e)  =  maxcw(e)  +  maxccw(e). 

3.2.  Limitations  on  Chains 

First,  it  is  obvious  that  a  length  0  execution  e  has  maxcw(e)  =  maxccw(e)  =  sum(e)  *  0. 

The  following  lemma  limits  the  growth  of  chains  at  a  single  Step  of  an  execution. 

Lemma  1 :  Let  e  be  an  execution  fragment  for  a  ring  R,  and  e'  another  execution  fragment 
consisting  of  all  but  the  last  step  of  e.  Then  (a)  maxcw(e)  <  maxcw(e’)  +  1 ,  with  maxcw(e)  * 
maxcw(e’)  if  no  messages  are  sent  clockwise  at  the  last  step  of  e,  and  (b)  maxccw(e)  < 
maxccw(e')  +  1 ,  with  maxccw(e)  =  maxccw(e')  if  no  messages  are  sent  counterclockwise  at  the 
last  step  of  e. 

Proof:  We  argue  part  (a).  Part  (b)  is  analogous.  The  second  half  of  the  claim  is  obvious.  We 
argue  the  inequality  maxcw(e)  <  maxcw(e')  +  1.  We  may  assume  that  maxcw(e)  >  1,  since 
otherwise  the  result  is  obvious. 

Let  S  and  T  be  order-equivalent  segments  of  length  maxcw(e)  for  which  there  is  a  clockwise 
chain  in  e.  Let  S’  and  T*  be  the  segments  of  length  maxcw(e)  •  1  consisting  of  all  but  the  last 
processor  in  S  and  T  respectively.  Then  S'  and  T'  are  order-equivalent.  Moreover,  since  only  the 
last  message  in  the  chain  could  have  been  sent  at  the  last  step  of  e,  it  must  be  that  e'  contains  a 
clockwise  chain  for  (S',T').  Thus,  maxcw(e’)  >  maxcw(e)  •  1 ,  as  required.  I 


3.3.  Bisegments 

In  the  next  subsection,  we  will  show  that,  for  comparison  algorithms,  limitations  on  chains  in  an  execution 
imply  limitations  on  distinguishability  of  processors.  In  order  to  state  those  results,  we  require  some 
additional  definitions. 

If  k1  and  k2  are  positive  integers,  a  {kyk^-bi segment  is  defined  to  be  a  pair  of  segments,  the  first  of  size  kt 
and  the  second  of  size  k2,  which  overlap  in  a  single  processor  (the  last  processor  of  the  first  segment  and  the 
first  of  the  second  segment).  The  processor  which  appears  in  both  segments  is  called  the  center  of  the 
bisegment.  The  spanning  segment  of  a  bisegment  is  the  segment  obtained  by  concatenating  the  two 
segments  in  the  bisegment,  and  removing  the  duplicated  center.  Two  bisegments  are  said  to  be 
order-equivalent  in  a  particular  ring  provided  their  spanning  segments  are  order-equivalent.  Two  processors 
p  and  q  are  ( k^,k2)-equivalent  in  a  particular  ring  provided  that  their  (k1,k2)-bisegments  (i.e.  the 
(k1tk2)-bisegments  centered  at  p  and  q)  are  order-equivalent. 

Let  S  «  (S1  ,S2)  and  T  =  (TVT2)  be  two  (k1,k2)-bisegments,  e  an  execution  or  execution  fragment.  Then  a 
clockwise  chain  in  e  for  (S,T)  is  a  clockwise  chain  in  e  for  (S^T^,  and  a  counterclockwise  chain  for  (S,T)  is  a 
counterclockwise  chain  for  (S2,T2).  A  chain  in  e  for  (S,T)  is  either  a  clockwise  chain  or  a  counterclockwise 
chain  for  (S,T). 

3.4.  Indistinguishability 

in  this  subsection,  we  give  a  lemma  which  shows  that,  for  comparison  algorithms,  absence  of  certain  chains 
implies  that  certain  processors  must  remain  "indistinguishable". 

Our  notion  of  "indistinguishability"  is  defined  as  follows.  If  S  and  T  are  two  ID  sequences,  each  of  length  k, 
and  s  and  t  are  two  S-expressions,  then  s  is  congruent  to  t  with  respect  to  (S,T)  provided  that  s  and  t  are 
structurally  equivalent,  and  corresponding  positions  in  s  and  t  contains  elements  from  corresponding 
positions  of  S  and  T,  respectively.  If  S  and  T  are  two  segments  of  a  particular  ring,  then  s  and  t  are  congruent 
with  respect  to  (S,T)  provided  that  s  and  t  are  congruent  with  respect  to  the  corresponding  sequences  of  ID's. 
Similarly,  if  S  and  T  are  two  bisegments  of  a  ring,  we  say  that  s  and  t  are  congruent  with  respect  to  S  and  T 
provided  that  they  are  congruent  with  respect  to  their  spanning  segments. 

Lemma  2:  Let  p  and  q  be  (k1tk2)-equivalent  processors  in  a  ring  R,  and  let  S  and  T  be  their 
respective  {krk2)-bisegments.  Let  e  be  an  execution  fragment  of  a  comparison  algorithm  for  R.  If 
there  are  no  chains  in  e  for  (S,T),  then  at  the  end  of  e,  the  states  of  p  and  q  are  congruent  with 
respect  to  (S,T). 


Proof:  The  proof  is  by  induction  on  the  length  of  e. 

Base:  |e|  =  0.  Neither  p  nor  q  has  received  any  messages  in  e,  so  they  will  remain  in  states  which 
are  congruent  with  respect  to  (S ,T). 

Inductive  step:  |e|  >  0,  and  the  result  holds  for  any  execution  fragment  of  length  shorter  than  |e| 
and  any  values  of  k1  and  k?.  Let  e’  denote  e  except  for  its  last  step.  Then  by  inductive  hypothesis, 
p  and  q  remain  in  states  which  are  congruent  with  respect  to  (S,T)  up  to  the  end  of  e’.  Consider 
what  happens  at  the  last  step.  Let  p'  and  q'  be  the  respective  counterclockwise  neighbors  of  p 
and  q,  and  p"  and  q"  the  respective  clockwise  neighbors. 

Case  1:  Both  of  the  following  hold:  (a)  Either  p’  and  q1  are  in  states  which  are  congruent  with 
respect  to  (S,T)  just  after  e\  or  else  neither  p’  nor  q’  sends  a  message  clockwise  at  the  last  step  of 
e.  (b)  Either  p”  and  q"  are  in  states  which  are  congruent  with  respect  to  (S,T)  just  after  e’,  or  else 
neither  p”  nor  q"  sends  a  message  counterclockwise  at  the  last  step  of  e. 

In  this  case,  it  is  easy  to  see  that  p  and  q  remain  in  states  which  are  congruent  with  respect  to 
(S,T),  after  e.  For  if  p’  and  q’  are  in  states  which  are  congruent  with  respect  to  (S,T)  just  after  e’, 
then  since  the  algorithm  is  a  comparison  algorithm,  they  both  make  the  same  decision  about 
whether  or  not  to  send  a  message  clockwise  at  the  last  step  of  e.  If  they  both  send  a  message, 

Then  the  messages  they  send  are  just  their  respective  states,  which  are  congruent  with  respect  to 
(S,T).  A  similar  argument  applies  to  p”  and  q”.  It  follows  that  p  and  q  remain  in  states  which  are 
congruent  with  respect  to  (S,T)  after  the  last  step  of  e. 

Case  2:  Processors  p’  and  q’  are  in  states  which  are  not  congruent  with  respect  to  (S,T)  just  after 
e’,  and  at  least  one  of  them  sends  a  message  clockwise  at  the  last  step  of  e. 

If  k1  =  1  (i.e.  if  p  and  q  are  at  the  counterclockwise  ends  of  their  respective  bisegments),  then  a 
clockwise  chain  for  (S,T)  is  produced  by  the  message  sent  at  the  last  step,  a  contradiction.  So 
assume  that  k1  >  1.  Since  p  and  q  are  (k.,,k2)-equivalent,  it  follows  that  p’  and  q’  are 
(k1-1,k2+  l)-equivalent.  Let  S’  and  T’  denote  their  respective  (k1-1  ,k2  +  1)-bisegments.  S’  and  T’ 
contain  exactly  the  same  processors  as  S  and  T  respectively,  but  are  centered  at  p’  and  q’  rather 
than  p  and  q.  Since  the  states  of  p’  and  q'  just  after  e'  are  not  congruent  with  respect  to  (S,T),  they 
are  also  not  congruent  with  respect  to  (S',T’).  By  the  inductive  hypothesis,  there  must  be  a  chain 
in  e’  for  (S’,T’>.  If  there  is  a  counterclockwise  chain  in  e’  for  (S’,T’),  then  it  is  also  a 
counterclockwise  chain  for  (S,T),  so  there  is  a  counterclockwise  chain  in  e  for  (S,T).  On  the  other 
hand,  if  there  is  a  clockwise  chain  in  e'  for  (S',T'),  then  since  at  least  one  of  p’  and  q'  sends  a 
message  clockwise  at  the  last  step  of  e,  we  obtain  a  clockwise  chain  in  e  for  (S,T).  Either  case  is  a 
contradiction. 

Case  3:  Processors  p”  and  q"  are  in  states  which  are  not  congruent  with  respect  to  (S,T)  just 
after  e',  and  at  least  one  of  them  sends  a  message  counterclockwise  at  the  last  step  of  e.  The 
argument  is  analogous  to  the  one  for  Case  2.  I 

Thus,  we  have  shown  that  absence  of  certain  chains  implies  that  certain  processors  will  remain  congruent. 


wo  corollaries  which  will  be  used  in  our  lower  bound  proof  follow  from  this  lemma.  The  first  one  says  that, 
hen  chains  are  short  and  there  are  lots  of  equivalent  processors,  any  message  which  gets  sent  has  many 
jrresponding  messages  sent  at  the  same  time  by  other  processors. 

Corollary  3:  Let  k  be  a  positive  integer.  Assume  ring  R  is  such  that  every  k-segment  has  at  least 

1  order-equivalent  k-segments.  Let  e  be  any  execution  fragment  of  a  comparison  algorithm  in  R,  e‘ 
be  another  fragment  consisting  of  all  but  the  last  step  of  e,  and  assume  that  sum(e’)  <  k.  If  some 
processor  p  sends  a  message  clockwise  (or  counterclockwise)  at  the  last  step  of  e,  then  there  are 
at  least  i  processors  that  do  the  same. 

Proof:  Consider  the  case  where  p  sends  a  message  clockwise.  The  other  case  is  analogous. 

Let  k1  =  maxcw(e')  +  1  and  k2  =  maxccw(e’)  +  1.  The  (kvk2)-bisegment  for  p  has  at  most  k 
elements,  so  that  p  has  at  least  i  (k^k^-equivalent  processors.  Let  q  be  any  one  of  these 
processors,  and  let  S  and  T  be  the  (krk2)-bisegments  centered  at  p  and  q,  respectively.  Then 
there  cannot  be  a  chain  in  e'  for  (S,T),  by  the  definitions  of  maxcw  and  maxccw.  But  then  Lemma 

2  implies  that  p  and  q  remain  congruent  with  respect  to  (S,T)  at  the  end  of  e’;  since  the  algorithm  is 
a  comparison  algorithm,  q  also  sends  a  message  clockwise  at  the  last  step  of  e.  I 

?mma  2  also  has  the  following  consequence  for  comparison  algorithms  to  elect  a  leader.  This  corollary  says 
at  long  chains  must  be  generated  in  order  to  elect  a  leader,  if  certain  equivalent  processors  exist. 

Corollary  4:  Let  k  be  a  positive  integer.  Let  R  be  a  ring  in  which  every  k-segment  S  has  another 
order-equivalent  k  segment  T.  Let  e  be  any  execution  fragment  of  a  comparison  algorithm  which 
elects  a  leader  in  R,  such  that  a  leader  gets  elected  in  e.  Then  sum(e)  >  k. 

Proof:  Assume  not  -  that  sum(e)  =  maxcw(e)  +  maxccw(e)  <  k.  Let  k1  =  maxcw(e)  +  1  and  k2 
=  maxccw(e)  +  1.  The  (krk2)  bisegment  for  the  processor  p  that  gets  elected  leader  has  at  most 
k  elements,  so  that  p  has  a  (krk2)-equivalent  processor  q  *  p;  let  S  and  T  be  the 
(krk2)-bisegments  centered  at  p  and  q,  respectively.  Then  there  cannot  be  a  chain  in  e  for  (S,T), 
by  the  definition  of  maxcw  and  maxccw.  But  then  Lemma  2  implies  that  p  and  q  remain  congruent 
with  respect  to  (S,T);  since  the  algorithm  is  a  comparison  algorithm,  p  and  q  cannot  be 
distinguished  as  to  leadership.  This  is  a  contradiction.  I 


.  Replication  Symmetry  in  Rings 

n  this  section,  we  show,  for  each  n,  how  to  construct  a  ring,  Rn,  consisting  of  n  ID's  chosen  from  a 
irticular  ID  space.  The  ring  Rn  is  constructed  to  have  a  large  amount  of  replication  symmetry.  That  is, 
gments  in  Rn  have  many  order-equivalent  segments.  In  the  next  section,  we  will  show  that  this  replication 
mmetry  causes  the  Rn  to  require  a  large  number  of  messages  for  election  of  a  leader. 
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1 .  Hierarchical  Organization  of  Processors 

rix  a  particular  ring  size  n  >  1.  Rn  will  be  constructed  by  first  constructing  a  sequential  pattern  Pn  and  then 
innecting  its  ends  to  form  a  ring.  The  construction  of  Pn  groups  the  processors  using  a  hierarchy  of  depth 
where  d  =  L  (loggn)/2  J.  We  describe  the  grouping  using  a  derivation  tree  of  a  context-free  grammar, 
iter,  we  will  use  the  structure  of  the  derivation  tree  to  assign  ID's  to  the  n  leaves  of  the  tree  and  thereby 
oduce  pattern  Pn. 


Define  the  context-free  grammar  G  as  follows.  The  nonterminals,  representing  groups  of  processors,  are  A. 
id  Bj,  1  <  i  <  j,  plus  Bq.  There  is  just  one  terminal  symbol,  p,  representing  a  processor.  The  start  symbol  is 
r  The  productions  are: 


B  ,A 

i  +  i  i  - 


A  .B  B  A  A.  .B.  ,B.  ,,  forO  <  i  <  d  -  1, 

1  1+1  1+1  1+1  1+1  1+1  1  +  1  t  + 1 *  —  —  ’ 


A  — +  A  B  B  A  A.  B  ,B.  „A.  ,A.  ,,  for  1  <  i  <  d  -  1, 

i  i  +  1  i+;  i  +  1  i  +  1  i  +  1  i  +  1  i  +  1  i  +  i  i  + 1 1  —  — 


Bd  ->  p  o,  and  Ad  -+  pad. 


(That  is,  Bd  generates  a  string  consisting  of  bd  p  symbols,  and  analogously  for  Ad.)  The  quantities  ad  and  bd 
ill  be  defined  later,  in  such  a  way  as  to  guarantee  that  the  length  of  the  unique  sentence  generated  by  G  is  n. 


For  each  i,  0  <  i  <  d,  define  the  level  i  sentential  form  of  G  to  be  the  unique  string  over  {A.,B.}  derivable  in 
i.  There  are  exactly  91  nonterminal  symbols  in  the  level  i  sentential  form.  Moreover,  for  each  i,  the  number  of 
ymbols  A.  is  exactly  one  less  that  the  number  of  symbols  B.. 

Lemma  5:  In  the  level  i  sentential  form  of  G,  0  <  i  <  d,  the  number  of  symbols  A.  is  L  9'/2  J,  and 
the  number  of  symbols  B.  is  T  9'/2  1. 

Proof:  By  induction  on  i.  I 


All  A(  nodes  derive  a  terminal  string  of  the  same  length;  let  us  call  this  length  a(.  Similarly,  all  Bj  nodes  derive 
terminal  string  of  the  same  length,  which  we  will  call  bj.  Let  c{  =  min(aj,bj),  for  all  i,  1  <  i  <  d. 

We  next  describe  how  to  select  the  values  ad  and  bd.  They  are  chosen  in  such  a  way  that  the  total  length  of 

ie  unique  sentence  derived  in  G  is  exactly  n,  and  so  that  |bd  -  ad|  is  small.  We  use  the  following. 

Lemma  6:  Let  m,  n  >  0  be  integers.  Then  there  are  integers  a  and  b  such  that  n  =  am  +  b(m  + 

1)  and  |b  -  a|  <  m. 

Proof:  Fix  m.  if  m  =  0,  then  a  =  b  =  n  suffices,  so  assume  that  m  >  1.  We  proceed  by 
induction  on  n. 


Basis:  n  =  0.  Then  a  =  b  =  0  suffices. 


Inductive  step:  Assume  that  n  =  am  +  b(m  +  1)  and  |b  -  a|  <  m.  We  will  produce  a’  and  b’  such 
thatn  +  1  =  a’m  +  b’(m  +  1)  and  |b’  -  a’|  <  m.  There  are  two  cases: 

Case  1.  b  •  a  <  m  •  2.  Then  let  a’  =  a  - 1  and  b'  =  b  +  1.  The  equation  is  satisfied,  and  b’  •  a’  • 
b  -  a  +  2.  Then  b’  •  a'  >  b  -  a  >  -m,  and  b'  -  a'  <  (m  -  2)  +  2  =  m,  as  needed. 

Case  2.  b  •  a  >  m  •  1 . 

Then  let  a’  =  a  +  m  and  b’  =  b  -  m  +  1.  The  equation  is  satisfied,  and  b'  -  a'  *  b  •  a  -  2m  +  1. 
Then  b’  •  a’  >  m  •  1  -  2m  +  1  =  -m,  and  b’  •  a'  <  b  -  a  since  m  >  1.  Thus,  b'  •  a’  <  m,  as  needed. 


Let  m  =  L  9d/2  J.  It  is  easy  to  see  that  m  is  9(n1/2),  and  in  particular,  that  m  <  n,/2/2.  Using  Lemma  6, 
choose  ad  and  bd  to  be  integers  such  that  n  =  adm  +  bd(m  +  1)  and  |bd  -  aj  <  m.  We  must  show  that  ad  and 
bd  are  nonnegative:  if  either  of  ad  and  bd  is  negative,  then  max(ad,bd)  <  m  - 1,  so  n  *  adm  +  bd(m  +  1)  < 
2(m2)  <  n/2,  a  contradiction. 

Lemma  7:  The  length  of  the  unique  sentence  generated  by  G  is  n. 

Proof:  By  Lemma  5,  there  are  exactly  L  9d/2  J  =  m  symbols  Ad,  and  exactly  T  9d/2  1  =  m  +  1 
symbols  Brf  in  the  level  d  sentential  form  of  G.  Since  n  =  adm  +  bd(m  +  1),  the  result  holds.  I 

We  have  already  noted  that  m  is  0(n1/2).  Since  ad  is  nonnegative,  we  have  that  n  >  bd(m  + 1).  Using  the 
lower  bound  on  m,  we  see  that  brf  is  0{n1/2). 

The  final  lemma  of  this  subsection  gives  the  exact  value  of  the  difference  c,  •  c|+ v 

Lemma  8:  The  difference  Cj  ■  cj  +  1  =  4  •  9d‘*l  +  1^(n  •  bd)/m,  forO  <  i  <,  d  •  1. 

Proof:  We  first  show  that  if  a.  +  1  <  bj  +  1,  then  a,  <  b,.  This  follows  since  a.  =  5aj  +  1  +  4b, +1  < 

4a,  +  5bj  +  1  =  b.  A  similar  fact  holds  if  bui  <  a. +  1.  Next  we  note  that  a.  •  a, +1  =  b,-bu1  * 

4(aj+1  +  bj  +  1).  Thus  no  matter  which  of  a, +  1  and  bj+1  is  smaller,  Cj-c.  +  1  =  4(aj+1  +  bu1). 

From  the  choice  of  ad  and  bd,  we  have  ad  +  bd  =  (n  •  bd)/m.  It  follows  that  a(+1  +  b,+1  « 

9d  ('  +  h(n  .  bd)/m.  Substituting  into  the  expression  for  c,  •  c.  + 1  gives  the  desired  result.  I 


4.2.  Labelling  of  Processors 

Let  X  be  the  ID  space  consisting  of  all  strings  of  length  d  +  1  whose  elements  are  nonnegative  integers, 
with  the  strings  ordered  lexicographically;  X  is  the  ID  space  from  which  the  pattern  Pn  will  be  constructed. 

We  define  Pn  by  describing  an  assignment  of  ID's  to  n  processors,  corresponding  to  the  leaves  of  the 
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derivation  tree  of  G.  In  order  to  do  this,  we  associate  labels  with  the  nodes  of  the  derivation  tree.  The  label  of 
the  root  of  the  tree  is  the  null  string.  If  a  node  with  a  corresponding  nonterminal  A(  or  B.,  0  <  i  <  d  - 1,  is 
labelled  by  the  string  w,  then  the  labels  of  its  n:->e  children  are  respectively  wO,  wl,  w2,  w3,  w8,  w7,  w6,  w5, 
w4.  If  a  node  with  a  corresponding  nonterminal  Ad  is  labelled  by  the  string  w,  then  the  labels  of  its  ad  children 

are  respectively  wO,  wl . w(ad  -1).  If  a  node  with  a  corresponding  nonterminal  Bd  is  labelled  by  the  string  w, 

then  the  labels  of  its  bd  children  are  respectively  wO,  wl,...,  w(bd  -  1).  Processor  ID’s  are  generated  by 
interpreting  the  labels  of  the  leaves  as  elements  of  X,  i.e.  as  length  d  +  1  strings  of  nonnegative  integers, 
ordered  lexicograpically. 

In  the  level  i  sentential  form  of  G,  define  an  ordered  pair  of  nonterminal  symbols  to  be  "of  type  A>A" 
provided  that  it  consists  of  the  two  symbols  A.  A.,  and  the  label  of  the  node  of  the  first  nonterminal  is 
lexicographically  greater  than  that  of  the  second.  We  use  analogous  definitions  for  types  A<A,  A>B,  A<B, 
B>A,  B<A,  B>B  and  B<B. 


We  now  show  that  the  level  i  sentential  form  has  equal  numbers  of  consecutive  pairs  of  nonterminals  of  the 
eight  possible  types. 

Lemma  9:  In  the  level  i  sentential  form  of  G,  0  <  i  <  d,  the  number  of  occurrences  of 
consecutive  pairs  of  each  of  the  eight  types  A>A,  A<A,  A>B,  A<B,  B>A,  B<A,  B>B  and  B<B  is  exactly 
L  978  J. 

Proof:  It  suffices  to  show  that  the  numbers  of  occurrences  of  the  eight  types  of  pairs  are  equal, 
since  the  total  number  of  pairs  is  exactly  9'  •  1  =  8  L  978  J.  We  proceed  by  induction  on  i.  For  the 
basis,  i  =  0,  the  result  is  vacuously  true.  Assume  that  the  result  is  true  for  i,  and  consider  the  level 
i  +  1  sentential  form.  There  are  two  kinds  of  pairs  of  level  i  + 1  nonterminals:  those  in  which  both 
elements  are  derived  from  the  same  level  i  nonterminal  node,  and  those  in  which  the  two  elements 
are  derived  from  two  different  level  i  nonterminal  nodes.  Each  level  i  nonterminal  node  generates 
a  length  9  sequence  of  level  i  +  1  nonterminals,  in  which  each  of  the  eight  types  of  pairs  has 
exactly  one  occurrence.  Therefore,  there  are  equal  numbers  of  the  eight  possible  types  among 
the  pairs  which  are  derived  from  the  same  level  i  nonterminal  node.  Also,  each  pair  which  is 
derived  from  two  different  level  i  nonterminal  nodes  is  of  the  same  type  as  the  corresponding  pair 
of  parent  nodes;  the  inductive  hypothesis  implies  that  there  are  equal  numbers  of  the  eight 
possible  types  among  these  pairs,  as  well.  The  result  follows.  I 


In  any  level  i  sentential  form,  note  that  the  pair  consisting  of  the  last  nonterminal  node  followed  by  the  first 
nonterminal  node,  is  of  type  B>B. 
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4.3.  Replication  Symmetry 

The  ring  Rn  is  constructed  by  folding  the  sequential  pattern  Pn  constructed  in  the  preceding  subsections 
into  a  ring;  the  sequential  pattern  Pn  is  arranged  around  the  circle  in  clockwise  order.  In  this  subsection,  we 
state  a  lemma  which  describes  the  replication  symmetry  of  R  .  This  lemma  will  be  used  in  the  next  section,  to 

yield  our  lower  bound  for  the  number  of  messages  required  by  a  comparison  algorithm  to  elect  a  leader. 

Lemma  10:  Let  1  <  i  <  d.  Let  S  be  any  segment  of  Rn  of  length  at  most  c-t  +  1.  Then  there  are 
at  least  L  9'/8  J  segments  of  Rp  that  are  order-equivalent  to  S,  including  S  itself. 

Proof:  Let  S  be  a  segment  of  R  ,  of  length  at  most  C;  +  1.  Then  S  is  contained  in  the  subtrees  of 
at  most  two  nonterminal  nodes  at  level  i.  These  two  are  either  two  consecutive  nonterminal  nodes, 
or  else  the  last  and  first  nonterminals  in  the  sentential  form.  Let  t  be  the  type  of  this  ordered  pair  of 
nonterminal  nodes. 

By  Lemma  9,  there  are  at  least  L  9'/8  J  instances  of  type  t  consecutive  pairs  of  nonterminal 
nodes  in  the  level  i  sentential  form.  Each  of  these  instances  of  a  pair  of  type  t  contains  a  segment 
which  is  order-equivalent  to  S. 


5.  Lower  Bound 

In  this  section,  we  state  and  prove  our  lower  bound  for  the  number  of  messages  required  by  a  comparison 
algorithm  to  elect  a  leader.  We  use  the  symmetric  rings  Rn  constructed  in  the  previous  section,  and  the  two 
corollaries  from  Section  3. 

Theorem  1 1 :  Let  J.  be  a  comparison  algorithm  over  an  arbitrary  ID  space,  X,  which  elects  a 
leader  in  rings  of  size  n.  Then  there  is  an  execution,  e,  of  J.  for  which  messages(e)  >  B(n  log  n). 

Proof:  Assume  n  is  fixed,  and  at  least  94.  This  ensures  that  the  depth  d  =  L(loggn)/2J  is  at  least 

2.  It  suffices  to  consider  the  ID  space  X  consisting  of  length  d  +  1  strings  of  nonnegative  integers, 
ordered  lexicographically.  Assume  the  ring  Rn  is  used.  Let  e  be  the  execution  fragment  for  Rn, 
which  terminates  just  when  the  elected  processor  enters  an  "elected"  state.  By  Lemma  10,  every 
segment  of  length  c2  +  1  has  at  least  one  other  order-equivalent  segment  in  Rn.  (The  Lemma 
actually  implies  that  there  are  at  least  nine  others,  but  we  do  not  require  this  fact  here.)  Thus,  by 
Corollary  4,  execution  e  must  progress  from  having  a  sum  of  0  to  having  a  sum  of  at  least  ca  +  1. 

Consider  any  step  of  e  at  which  the  sum  first  stops  being  at  most  k,  for  any  k  <  cr  By  Lemma  9, 
the  sum  increases  by  at  most  2  at  this  step.  Moreover,  if  no  messages  are  sent  clockwise  (resp., 
counterclockwise)  at  this  step,  then  the  sum  increases  by  at  most  1. 

Let  e’  be  the  prefix  of  e  preceding  this  step.  Then  sum(e')  <  C|  +  1.  Lemma  10  implies  that  any 
segment  of  length  c  +  1  has  at  least  L  978  J  order-equivalent  segments  in  R  .  Thus  by  Corollary 

3,  if  any  messages  are  sent  clockwise  at  this  step,  then  at  least  L  978  J  messages  are  sent 
clockwise,  and  similarly  for  messages  sent  counterclockwise.  Thus,  if  the  sum  increases  by  1  at 
this  step,  at  least  L  978  J  messages  are  sent,  while  if  the  sum  increases  by  2  at  this  step,  then  at 


least  twice  that  number  of  messages  are  sent.  It  follows  that  the  cost  of  increasing  the  sum  from  0 
to  at  least  c2  +  1  can  be  apportioned  as  a  cost  of  at  least  L  9'/8  J  for  each  increase  from  k  to  k  + 

1 ,  where  k  <.  c,. 

We  now  total  up  the  number  of  messages  sent  in  e.  Grouping  increases  according  to  level,  we 
see  that  the  number  of  messages  sent  must  be  at  least 

Si  =  2 . d-1L9i/8J(Ci-Ci  +  l)- 

By  Lemma  8,  this  quantity  is  equal  to 

Sj  =  2 . d.1L9i/8J(4*9d<i  +  1)(n-bd)/m) 

*  4  {(n  •  bd)/m )  2.  =  2 . ^  L  978  J  9d  (l  + 1} 

>  4  ((n  -  bd)/m )  [2.  =  2 . ^  9l/Q  9d(i  +  •  2.  s  2 . ^  9d‘(i  * 1}). 

The  first  summation  evaluates  to  (d-2)  9d‘V8,  while  the  second  is  bounded  above  by  9d'2/8. 
Thus,  the  message  bound  is  at  least 

4  ((n  -  bd)/m  )  f(d-2)  9dV8  -  9d‘2/8). 

Since  m  =  L  9d/2  J  <  9d/2,  this  is  at  least 

8  ((n  -  bd)/  9d)  [(d-2)  9d  V8  -  9d'2/8] 

=  (n  •  bd)  [(d-2)/9  - 1/81]  =  (n  -  bd)  [d/9  -  0(1)). 

Since  bd  is  0(n1/2),  the  message  bound  is  at  least 

=  (n  -0(n1/2))  [d/9  0(1)] 

=  (n  -  0(n1/2))  [((1/2)loggn)/9  -  0(1)) 

=  n((1/2)loggn)/9-0(n) 

a  (nlog2n)/(l8log29)-0(n).  I 

We  have  just  shown  that  algorithms  which  are  restricted  to  use  only  comparisons  of  ID’s  require  at  least  0(n 
log  n)  messages  in  the  worst  case.  We  now  state  a  corollary  which  shows  that  the  given  lower  bound  applies 
for  time-bounded  algorithms  as  well.  More  specifically,  we  conclude  that  there  is  a  (very  fast-growing) 
function  f  with  the  following  property.  If  the  time  is  required  to  be  bounded  by  some  t  in  the  worst  case,  and 
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ID's  are  chosen  from  any  ID  space  having  at  least  f(n,t)  elements,  then  any  algorithm  requires  Q(n  log  n) 
messages  in  the  worst  case. 

Corollary  1 2:  There  exists  a  function  f  such  that  for  any  n,  t,  the  following  holds.  Let  X  be  an 
arbitrary  ID  space  with  at  least  f(n,t)  elements.  Let  A  be  any  algorithm  over  X  which  elects  a  leader 
in  rings  of  size  n,  and  for  which  finishtime(e)  <  t  for  all  executions  e  for  rings  of  size  n.  Then  there 
is  an  execution,  e,  of  A  for  which  messages(e)  is  fl(n  log  n). 

Proof:  The  proof  is  analogous  to  a  similar  one  in  [FL].  It  is  based  on  a  transformation  from 
time-bounded  algorithms  to  comparison  algorithms,  using  Ramsey’s  theorem.  (The  idea  for  the 
transformation,  presented  in  the  paracomputer  model,  originally  appeared  in  [Si].  Snir  [S2] 
credits  Yao  [Y]  with  inspiration  for  the  construction.)  I 

6.  Remaining  Questions 

The  general  fi(n  log  n)  bound  which  we  have  proved  has  a  very  small  constant,  1/(18  log2(9)).  In  contrast, 
the  best  constant  known  for  an  upper  bound  is  around  1.4  [P,DKR).  It  remains  to  close  this  gap.  For  certain 
values  of  n  (such  as  powers  of  2),  we  do  have  a  narrower  gap,  because  the  proof  in  [FL]  produces  a  lower 
bound  with  a  larger  constant.  It  is  possible  that  there  are  certain  properties  of  the  number  n  (e.g.  properties 
of  its  prime  factorization)  that  affect  the  size  of  the  constant.  It  would  be  interesting  to  understand  these 
relationships. 
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